Eau Claire
Learning to Decode: Reinforcement Learning for Decoding of Sparse Graph-Based Channel Codes
We show in this work that reinforcement learning can be successfully applied to decoding short-to-moderate-length sparse graph-based channel codes. Specifically, we focus on low-density parity-check (LDPC) codes, which, for example, have been standardized for 5G cellular communication systems due to their excellent error-correcting performance. These codes are typically decoded by iterative belief propagation on the code's bipartite (Tanner) graph using a flooding schedule, i.e., all check and variable nodes in the Tanner graph are updated at once. In contrast, in this paper we utilize a sequential update policy that selects the optimal check node (CN) schedule in order to improve decoding performance. In particular, we model the CN update process as a multi-armed bandit process with dependent arms and employ a Q-learning scheme to optimize the CN scheduling policy. To reduce the learning complexity, we propose a novel graph-induced CN clustering approach that partitions the state space so that dependencies between clusters are minimized. Our results show that, compared to other decoding approaches from the literature, the proposed reinforcement learning scheme not only significantly improves the decoding performance but also dramatically reduces the decoding complexity once the scheduling policy is learned.
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.14)
- North America > United States > New Jersey > Essex County > Newark (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
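The abstract above casts check-node scheduling as a multi-armed bandit optimized with Q-learning. A minimal, stateless sketch of that idea follows; the reward function (here assumed to be the reduction in unsatisfied parity checks from updating the chosen CN), the epsilon-greedy rule, and all hyperparameters are illustrative placeholders, not the authors' implementation:

```python
import random

# Hypothetical, stateless sketch: epsilon-greedy Q-learning over check-node (CN)
# selection. `reward_fn(cn)` is assumed to return the reduction in unsatisfied
# parity checks achieved by updating CN `cn`.
def learn_cn_schedule(num_cns, reward_fn, episodes=200, alpha=0.1, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = [0.0] * num_cns
    for cn in range(num_cns):            # pull every arm once to initialize
        q[cn] += alpha * (reward_fn(cn) - q[cn])
    for _ in range(episodes):
        if rng.random() < eps:
            cn = rng.randrange(num_cns)                  # explore
        else:
            cn = max(range(num_cns), key=q.__getitem__)  # exploit
        q[cn] += alpha * (reward_fn(cn) - q[cn])         # running-average update
    # learned policy: update CNs in decreasing order of estimated value
    return sorted(range(num_cns), key=lambda c: -q[c])

# toy reward: CN 2 resolves the most unsatisfied checks
schedule = learn_cn_schedule(4, lambda cn: 1.0 if cn == 2 else 0.1)
```

The paper's actual scheme additionally partitions the state space via graph-induced CN clustering; this toy omits state entirely.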
Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
Abdulhai, Marwa, Cheng, Ryan, Clay, Donovan, Althoff, Tim, Levine, Sergey, Jaques, Natasha
Large Language Models (LLMs) are increasingly used to simulate human users in interactive settings such as therapy, education, and social role-play. While these simulations enable scalable training and evaluation of AI agents, off-the-shelf LLMs often drift from their assigned personas, contradict earlier statements, or abandon role-appropriate behavior. We introduce a unified framework for evaluating and improving persona consistency in LLM-generated dialogue. We define three automatic metrics (prompt-to-line consistency, line-to-line consistency, and Q&A consistency) that capture different types of persona drift, and validate each against human annotations. Using these metrics as reward signals, we apply multi-turn reinforcement learning to fine-tune LLMs for three user roles: a patient, a student, and a social chat partner. Our method reduces inconsistency by over 55%, resulting in more coherent and faithful simulated users.
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.14)
- North America > United States > Virginia (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (12 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (0.93)
- Personal > Interview (0.92)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Consumer Health (1.00)
- Government (1.00)
- Education > Educational Setting > K-12 Education (1.00)
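As a rough illustration of how consistency metrics like those above could drive multi-turn RL, the sketch below scores each generated line against the persona prompt and averages the scores into a scalar reward. The token-overlap scorer is a stand-in assumption; the paper's metrics are learned and validated against human annotations, not lexical.

```python
# Hypothetical sketch of a "prompt-to-line" consistency reward. A Jaccard
# token-overlap score stands in for the real (validated) scorer, purely to
# illustrate how per-line scores aggregate into an RL reward.
def token_overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def persona_consistency_reward(persona_prompt, dialogue_lines):
    # score every generated line against the persona description, then average;
    # multi-turn RL would receive this scalar as the episode reward
    scores = [token_overlap(persona_prompt, line) for line in dialogue_lines]
    return sum(scores) / len(scores)
```

Line-to-line consistency would compare adjacent utterances the same way instead of comparing each line to the prompt.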
Ask What Your Country Can Do For You: Towards a Public Red Teaming Model
Kennedy, Wm. Matthew, Patlak, Cigdem, Dave, Jayraj, Chambers, Blake, Dhanotiya, Aayush, Ramiah, Darshini, Schwartz, Reva, Hagen, Jack, Kundu, Akash, Pendharkar, Mouni, Baisley, Liam, Skeadas, Theodora, Chowdhury, Rumman
AI systems have the potential to produce both benefits and harms, but without rigorous and ongoing adversarial evaluation, AI actors will struggle to assess the breadth and magnitude of the AI risk surface. Researchers from the field of systems design have developed several effective sociotechnical AI evaluation and red-teaming techniques targeting bias, hate speech, mis/disinformation, and other documented harm classes. However, as increasingly sophisticated AI systems are released into high-stakes sectors (such as education, healthcare, and intelligence-gathering), our current evaluation and monitoring methods are proving less and less capable of delivering effective oversight. To actually deliver responsible AI, and to ensure that AI's harms are fully understood and its security vulnerabilities mitigated, pioneering new approaches to close this "responsibility gap" is now more urgent than ever. In this paper, we propose one such approach, the cooperative public AI red-teaming exercise, and discuss early results of its prior pilot implementations. This approach is intertwined with CAMLIS itself: the first in-person public demonstrator exercise was held in conjunction with CAMLIS 2024. We review the operational design and results of this exercise, the National Institute of Standards and Technology's (NIST's) prior Assessing the Risks and Impacts of AI (ARIA) pilot exercise, and another similar exercise conducted with the Singapore Infocomm Media Development Authority (IMDA). Ultimately, we argue that this approach is both capable of delivering meaningful results and scalable to many AI-developing jurisdictions.
- Asia > Singapore (0.36)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.04)
- (6 more...)
Adaptive-VP: A Framework for LLM-Based Virtual Patients that Adapts to Trainees' Dialogue to Facilitate Nurse Communication Training
Lee, Keyeun, Lee, Seolhee, Kim, Esther Hehsun, Ko, Yena, Eun, Jinsu, Kim, Dahee, Cho, Hyewon, Zhu, Haiyi, Kraut, Robert E., Suh, Eunyoung, Kim, Eun-mee, Lim, Hajin
Effective communication training is essential to preparing nurses for high-quality patient care. While standardized patient (SP) simulations provide valuable experiential learning, they are often costly and inflexible. Virtual patient (VP) systems offer a scalable alternative, but most fail to adapt to the varying communication skills of trainees. In particular, when trainees respond ineffectively, VPs should escalate in hostility or become uncooperative; yet this level of adaptive interaction remains largely unsupported. To address this gap, we introduce Adaptive-VP, a VP dialogue generation framework that leverages large language models (LLMs) to dynamically adapt VP behavior based on trainee input. The framework features a pipeline for constructing clinically grounded yet flexible VP scenarios and a modular system for assessing trainee communication and adjusting VP responses in real time, while ensuring learner safety. We validated Adaptive-VP by simulating challenging patient conversations. Automated evaluation using a corpus from practicing nurses showed that our communication skill evaluation mechanism reflected real-world proficiency levels. Expert nurses further confirmed that Adaptive-VP produced more natural and realistic interactions than existing approaches, demonstrating its potential as a scalable and effective tool for nursing communication training.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > Texas (0.04)
- (7 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Instructional Material (1.00)
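The adaptive behavior described above (escalating hostility when a trainee responds ineffectively, de-escalating otherwise) can be caricatured as a small state update. The skill thresholds and hostility levels here are purely hypothetical; in a system like Adaptive-VP the resulting level would condition the LLM's prompt for the next turn.

```python
# Hypothetical sketch: a scalar "communication skill" assessment of the trainee's
# last utterance drives the virtual patient's hostility level. Thresholds and
# level names are illustrative assumptions only.
def update_hostility(current_level, skill_score, levels=("calm", "irritated", "hostile")):
    i = levels.index(current_level)
    if skill_score < 0.4:        # ineffective response: escalate
        i = min(i + 1, len(levels) - 1)
    elif skill_score > 0.7:      # effective response: de-escalate
        i = max(i - 1, 0)
    return levels[i]             # middling responses leave the level unchanged
```

A real implementation would also need the learner-safety guardrails the abstract mentions, e.g. a cap on how hostile the VP may become.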
Gumbel Counterfactual Generation From Language Models
Ravfogel, Shauli, Svete, Anej, Snæbjarnarson, Vésteinn, Cotterell, Ryan
Understanding and manipulating the causal generation mechanisms in language models is essential for controlling their behavior. Previous work has primarily relied on techniques such as representation surgery (e.g., model ablations or manipulation of linear subspaces tied to specific concepts) to \emph{intervene} on these models. To understand the impact of interventions precisely, it is useful to examine counterfactuals, e.g., how a given sentence would have appeared had it been generated by the model following a specific intervention. We highlight that counterfactual reasoning is conceptually distinct from interventions, as articulated in Pearl's causal hierarchy. Based on this observation, we propose a framework for generating true string counterfactuals by reformulating language models as structural equation models using the Gumbel-max trick, which we call Gumbel counterfactual generation. This reformulation allows us to model the joint distribution over original strings and their counterfactuals resulting from the same instantiation of the sampling noise. We develop an algorithm based on hindsight Gumbel sampling that allows us to infer the latent noise variables and generate counterfactuals of observed strings. Our experiments demonstrate that the approach produces meaningful counterfactuals while at the same time showing that commonly used intervention techniques have considerable undesired side effects.
- Europe > Ireland (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- (34 more...)
- Media (1.00)
- Leisure & Entertainment (1.00)
- Education (1.00)
- (4 more...)
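The Gumbel-max reformulation at the heart of the abstract above can be shown on a toy categorical distribution: sampling becomes argmax(logits + Gumbel noise), and reusing the same noise under intervened logits yields the counterfactual outcome. This is a minimal sketch of the shared-noise idea, not the paper's hindsight-sampling algorithm for language models.

```python
import math
import random

# Toy Gumbel-max counterfactual: draw Gumbel noise once, take the argmax under the
# original logits (factual sample) and under the intervened logits (counterfactual),
# so both outcomes share the same instantiation of the sampling noise.
def gumbel_noise(n, rng):
    return [-math.log(-math.log(rng.random())) for _ in range(n)]

def sample_and_counterfactual(logits, intervened_logits, rng):
    g = gumbel_noise(len(logits), rng)  # shared noise across both worlds
    factual = max(range(len(logits)), key=lambda i: logits[i] + g[i])
    counterfactual = max(range(len(logits)), key=lambda i: intervened_logits[i] + g[i])
    return factual, counterfactual
```

Hindsight Gumbel sampling goes the other way: it infers noise consistent with an observed string before replaying it under the intervention.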
Exploration and Evaluation of Bias in Cyberbullying Detection with Machine Learning
Root, Andrew, Jakubowski, Liam, Vanamala, Mounika
It is well known that the usefulness of a machine learning model rests on its ability to generalize to unseen data. This study uses three popular cyberbullying datasets to explore how the data, the way it is collected, and the way it is labeled affect the resulting machine learning models. The bias introduced by differing definitions of cyberbullying and by data collection is discussed in detail. An emphasis is placed on the impact of dataset expansion methods, which utilize current data points to fetch and label new ones. Furthermore, explicit testing is performed to evaluate the ability of a model to generalize to unseen datasets through cross-dataset evaluation. As hypothesized, the models show a significant drop in Macro F1 Score, with an average drop of 0.222. As such, this study effectively highlights the importance of dataset curation and cross-dataset testing for creating models with real-world applicability. The experiments and other code can be found at https://github.com/rootdrew27/cyberbullying-ml.
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.15)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
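The cross-dataset protocol described above reduces to a train-on-one, test-on-all loop comparing in-dataset to cross-dataset Macro F1. In this sketch `train_fn` and `eval_fn` are hypothetical stand-ins (the real experiments live in the linked repository), and the toy scores are chosen only to echo the reported 0.222 average drop:

```python
# Illustrative cross-dataset evaluation: train on each dataset, score on every
# dataset, then report average in-dataset Macro F1 minus average cross-dataset
# Macro F1. `train_fn` and `eval_fn` are hypothetical placeholders.
def average_macro_f1_drop(datasets, train_fn, eval_fn):
    results = {}
    for train_name, train_data in datasets.items():
        model = train_fn(train_data)
        for test_name, test_data in datasets.items():
            results[(train_name, test_name)] = eval_fn(model, test_data)
    in_ds = [results[(n, n)] for n in datasets]
    cross = [v for (a, b), v in results.items() if a != b]
    return sum(in_ds) / len(in_ds) - sum(cross) / len(cross)

# toy scores: perfect in-dataset, degraded cross-dataset
toy = {"A": "a", "B": "b"}
drop = average_macro_f1_drop(toy, lambda d: d, lambda m, d: 1.0 if m == d else 0.778)
```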
The Role of Emotions in Informational Support Question-Response Pairs in Online Health Communities: A Multimodal Deep Learning Approach
Jozani, Mohsen, Williams, Jason A., Aleroud, Ahmed, Bhagat, Sarbottam
This study explores the relationship between informational support seeking questions, responses, and helpfulness ratings in online health communities. We created a labeled dataset of question-response pairs and developed multimodal machine learning and deep learning models to reliably predict informational support questions and responses. We employed explainable AI to reveal the emotions embedded in informational support exchanges, demonstrating the importance of emotion in providing informational support. This complex interplay between emotional and informational support has not previously been researched. The study refines social support theory and lays the groundwork for the development of user decision aids. Further implications are discussed.
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.14)
- North America > United States > Hawaii (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
VALID: a Validated Algorithm for Learning in Decentralized Networks with Possible Adversarial Presence
Bakshi, Mayank, Ghasvarianjahromi, Sara, Yakimenka, Yauhen, Beemer, Allison, Kosut, Oliver, Kliewer, Joerg
We introduce the paradigm of validated decentralized learning for undirected networks with heterogeneous data and possible adversarial infiltration. We require (a) convergence to a global empirical loss minimizer when adversaries are absent, and (b) either detection of adversarial presence or convergence to an admissible consensus model in their presence. This contrasts sharply with the traditional Byzantine-robustness requirement of convergence to an admissible consensus irrespective of the adversarial configuration. A distinctive aspect of our study is a heterogeneity metric based on the norms of individual agents' gradients computed at the global empirical loss minimizer.
Machine learning is increasingly reliant on data from a variety of distributed sources. As such, it may be difficult to ensure that the data which originates from these sources is trustworthy. Thus, there is a need to develop distributed and decentralized learning strategies that can respond to bad or even malicious data. However, worst-case or Byzantine resilience is an extremely strong requirement: performance must be maintained even if a malicious adversary controls a subset of the processing nodes and takes any conceivable action. In practice, an adversary launching such an attack against a learning process requires tremendous resources, which may not be worth the cost of influencing the learned model. Thus, even though malicious adversaries are a threat, for the vast majority of the time they are not present, and an algorithm that maintains Byzantine robustness necessarily sacrifices performance when no adversaries are present.
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.04)
- North America > United States > New Jersey (0.04)
- North America > United States > Arizona (0.04)
- Europe > Italy (0.04)
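The heterogeneity metric highlighted above (norms of individual agents' gradients evaluated at the global empirical loss minimizer) is easy to illustrate with scalar quadratic losses. This toy, with invented variable names, sketches the metric's definition only, not the VALID algorithm itself.

```python
# Sketch of the heterogeneity metric: evaluate each agent's local gradient at the
# global empirical loss minimizer and take its norm. Larger norms mean that
# agent's local objective disagrees more with the global one.
def heterogeneity(local_grad_fns, global_minimizer):
    return [abs(g(global_minimizer)) for g in local_grad_fns]

# toy example: agents hold scalar losses (w - a_i)^2 with gradients 2(w - a_i);
# the minimizer of the summed loss is the mean of the a_i
agents = [1.0, 2.0, 6.0]
w_star = sum(agents) / len(agents)  # = 3.0
grads = [(lambda a: (lambda w: 2 * (w - a)))(a) for a in agents]
# heterogeneity(grads, w_star) -> [4.0, 2.0, 6.0]
```

Agents far from the consensus (here the one holding a_i = 6) contribute the largest gradient norms, which is what the metric quantifies.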
Recent Advancements In The Field Of Deepfake Detection
Krueger, Natalie, Vanamala, Mounika, Dave, Rushit
A deepfake is a photo or video of a person whose image has been digitally altered or partially replaced with an image of someone else. Deepfakes have the potential to cause a variety of problems and are often used maliciously. A common usage is altering videos of prominent political figures and celebrities. These deepfakes can portray their subjects making offensive, problematic, and/or untrue statements. Current deepfakes can be very realistic and, when used in this way, can spread panic and even influence elections and political opinions. There are many deepfake detection strategies currently in use, but finding the most comprehensive and universal method is critical. In this survey, we therefore address the problems of malicious deepfake creation and the lack of universal deepfake detection methods. Our objective is to survey and analyze a variety of current methods and advances in the field of deepfake detection.
- North America > United States > Wisconsin > Eau Claire County > Eau Claire (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- (7 more...)